feat: add top-level kNN search via Query\Knn and Query::setKnn()#2314
feat: add top-level kNN search via Query\Knn and Query::setKnn()#2314Cryde wants to merge 1 commit into
Conversation
📝 WalkthroughWalkthroughAdds top-level kNN query support: new ChangeskNN Query Feature
Sequence DiagramsequenceDiagram
participant Client
participant Query
participant Knn
participant Elasticsearch
Client->>Query: setKnn(Knn with filters)
Query->>Knn: toArray()
Knn-->>Query: serialized knn parameter(s)
Query->>Query: store at top-level "knn"
Query-->>Client: Query object (fluent)
Client->>Elasticsearch: execute request containing knn
Elasticsearch-->>Client: matching documents
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Query.php`:
- Line 56: TRawQuery's knn union type is too permissive and allows object forms
that can become nested (knn.knn) when serialized; change the TRawQuery['knn']
definition to accept only normalized array shapes (e.g.,
array<string,mixed>|list<array<string,mixed>>) and remove Knn|list<Knn> from
that raw type, or alternatively add a normalization step in the
raw-to-normalized conversion path (the function/method that builds/serializes
TRawQuery) to detect Knn objects and convert them into the flat array form
before serialization; update any references to TRawQuery and the knn handling
code to use the normalized array form so raw queries never contain nested knn
objects.
- Line 511: Update the setKnn method signature to use a native integer type for
its parameter (change setKnn($knn) to setKnn(int $knn): self) and adjust any
callers to pass an int; also update or add PHPDoc on Query::setKnn to reflect
the typed parameter if needed. Ensure the method body and any
assignments/validations inside the setKnn implementation accept and operate on
an int value.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8551ea06-cc17-4fcf-90b0-2fdac223ebac
📒 Files selected for processing (4)
CHANGELOG.mdsrc/Query.phpsrc/Query/Knn.phptests/Query/KnnTest.php
3824342 to
c10a45a
Compare
ruflin
left a comment
There was a problem hiding this comment.
Thanks for the contribution. What is your take on the KNN being it's own object not extending Query (as it isn't one 🤔 )
| /** | ||
| * Top-level kNN search. | ||
| * | ||
| * Note: `knn` is a sibling of `query` in the search request body, not a clause inside it. |
There was a problem hiding this comment.
That is the part I stumble over. Should we make it part of Query or make it it's own top level thing extending Parms instead?
There was a problem hiding this comment.
You're right, knn is a sibling of query in the search body.
Nothing today prevents a user from passing a Knn to Query::setQuery() or inside a BoolQuery, where it would silently produce an invalid request.
I'll switch to extending Param indeed. Pushing the change shortly !
| * | ||
| * @see https://www.elastic.co/docs/solutions/search/vector/knn | ||
| */ | ||
| class Knn extends AbstractQuery |
Adds an Elastica\Query\Knn class wrapping Elasticsearch's top-level `knn` search clause, plus Query::setKnn() to attach a single Knn or a list of Knn (multiple kNN searches in a single request) as a sibling of `query`. - Knn supports field, query_vector, k, num_candidates plus optional pre-filters (addFilter), similarity threshold and boost. - Query::toArray() no longer auto-pads with match_all when only knn is set.
c10a45a to
72d0ea0
Compare
|
Also maybe I should move to |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Query.php`:
- Around line 511-517: The setKnn method currently accepts any array and then
maps assuming list<Knn>; update setKnn(Knn|array $knn) to validate array inputs
before mapping: if $knn is an array ensure it is a non-empty list (no
associative keys), and each element is an instance of Knn; if validation fails
throw InvalidException with a clear message like "setKnn expects a non-empty
list of Knn instances"; after validation proceed to map each Knn to its
->toArray()['knn'] value (preserving existing mapping behavior) so only valid
Knn payloads are produced.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1596fbd3-02fa-4d08-87ef-59e30cafa20f
📒 Files selected for processing (4)
CHANGELOG.mdsrc/Query.phpsrc/Query/Knn.phptests/Query/KnnTest.php
✅ Files skipped from review due to trivial changes (1)
- CHANGELOG.md
| public function setKnn(Knn|array $knn): self | ||
| { | ||
| if (\is_array($knn)) { | ||
| $value = \array_map(static fn (Knn $k): array => $k->toArray()['knn'], $knn); | ||
| } else { | ||
| $value = $knn->toArray()['knn']; | ||
| } |
There was a problem hiding this comment.
Validate setKnn() array input before mapping.
Line 513 accepts any array, but the mapper assumes list<Knn>. Empty/non-list/non-Knn entries fail late (or build invalid knn payloads). Add explicit guards and throw InvalidException with a clear message.
Proposed fix
public function setKnn(Knn|array $knn): self
{
if (\is_array($knn)) {
+ if ([] === $knn || !\array_is_list($knn)) {
+ throw new InvalidException('Knn must be a non-empty list of Knn instances.');
+ }
+ foreach ($knn as $entry) {
+ if (!$entry instanceof Knn) {
+ throw new InvalidException('Each knn entry must be an instance of '.Knn::class.'.');
+ }
+ }
$value = \array_map(static fn (Knn $k): array => $k->toArray()['knn'], $knn);
} else {
$value = $knn->toArray()['knn'];
}
return $this->setParam('knn', $value);
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/Query.php` around lines 511 - 517, The setKnn method currently accepts
any array and then maps assuming list<Knn>; update setKnn(Knn|array $knn) to
validate array inputs before mapping: if $knn is an array ensure it is a
non-empty list (no associative keys), and each element is an instance of Knn; if
validation fails throw InvalidException with a clear message like "setKnn
expects a non-empty list of Knn instances"; after validation proceed to map each
Knn to its ->toArray()['knn'] value (preserving existing mapping behavior) so
only valid Knn payloads are produced.
|
Argh, I missed the ping.
I was thinking the same. i would like to limit the high level objects but likely makes more sense. |
Adds an Elastica\Query\Knn class wrapping Elasticsearch's top-level
knnsearch clause, plus Query::setKnn() to attach a single Knn or a list of
Knn (multiple kNN searches in a single request) as a sibling of
query.pre-filters (addFilter), similarity threshold and boost.
Summary by CodeRabbit
New Features
Tests